1. (Previously Presented) A programmable processor for executing a plurality of
programs, said programmable processor comprising: 

an execution pipeline having an average pipeline latency of one instruction per cycle; and

an interleaver for interleaving instructions from said plurality of programs and providing 
said instructions to said pipeline for execution such that the number of said plurality of 
programs that are interleaved is greater than or equal to the depth of the pipeline. 
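
For illustration only, the interleaving recited in claim 1 can be sketched as a round-robin issue schedule. The sketch assumes equal-length programs, one instruction issued per cycle, and a result that becomes available `depth` cycles after issue; the names `interleave` and `hazard_free` are hypothetical and do not appear in the claims.

```python
def interleave(programs, depth):
    """Round-robin issue of one instruction per cycle.

    Returns a list of (issue_cycle, program_id, instruction, done_cycle).
    Assumption: every program has the same number of instructions.
    """
    n = len(programs)
    if n < depth:
        raise ValueError("number of programs must be >= pipeline depth")
    length = len(programs[0])
    schedule = []
    for cycle in range(n * length):
        pid = cycle % n    # which program issues in this cycle
        idx = cycle // n   # index of that program's next instruction
        schedule.append((cycle, pid, programs[pid][idx], cycle + depth))
    return schedule


def hazard_free(schedule):
    """True if no program issues before its previous instruction is done."""
    done_at = {}
    for issue, pid, _instr, done in schedule:
        if pid in done_at and issue < done_at[pid]:
            return False
        done_at[pid] = done
    return True
```

Under these assumptions, with four programs and a pipeline of depth 4, the second instruction of program 0 issues in cycle 4, the same cycle in which the result of its first instruction becomes available.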

2. (Previously Presented) The processor of claim 1 or claim 23 wherein said pipeline 
has a datapath with a depth equal to said number of programs. 

3. (Previously Presented) The processor of claim 1 wherein a next instruction from 
one of said plurality of programs is not provided to said pipeline until a previous 
instruction of said one of said plurality of programs has completed. 

4. (Previously Presented) The processor of claim 1 or claim 23 wherein each program 
of said plurality of programs is independent of the other of said plurality of programs. 

5. (Previously Presented) The processor of claim 1 or claim 23 further including an 
output buffer for storing out of order data output. 

6. (Previously Presented) The processor of claim 1 or claim 23 further including one
or more of a register copy, program counter, and program counter stack provided for each 
of said plurality of programs. 

7. (Previously Presented) The processor of claim 6 wherein one or more of control 
and computing resources, instructions, instruction memory, data paths, data memory, and 
caches are shared by said plurality of programs. 
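
By way of a non-limiting sketch, the per-program state of claim 6 and the shared resources of claim 7 might be modeled as below. The toy instruction set (`addi`, `call`, `ret`, `halt`), the four-entry register copy, and the names `ProgramContext`, `step`, and `run_interleaved` are assumptions made for illustration and are not taken from the claims.

```python
from dataclasses import dataclass, field


@dataclass
class ProgramContext:
    """Per-program state: register copy, program counter, PC stack."""
    pc: int = 0
    registers: list = field(default_factory=lambda: [0, 0, 0, 0])
    pc_stack: list = field(default_factory=list)
    done: bool = False


def step(ctx, instruction_memory):
    """Execute one instruction for one program from the shared memory."""
    op, *args = instruction_memory[ctx.pc]
    if op == "addi":      # addi reg, imm: registers[reg] += imm
        reg, imm = args
        ctx.registers[reg] += imm
        ctx.pc += 1
    elif op == "call":    # push the return address, jump to the target
        ctx.pc_stack.append(ctx.pc + 1)
        ctx.pc = args[0]
    elif op == "ret":     # pop the return address
        ctx.pc = ctx.pc_stack.pop()
    elif op == "halt":
        ctx.done = True
    else:
        raise ValueError(f"unknown opcode: {op}")


# Instruction memory shared by all programs (hypothetical contents).
SHARED_CODE = [
    ("call", 3),      # 0: call the subroutine at address 3
    ("addi", 0, 10),  # 1
    ("halt",),        # 2
    ("addi", 0, 1),   # 3: subroutine body
    ("ret",),         # 4
]


def run_interleaved(contexts, instruction_memory):
    """Issue one instruction per unfinished program, round robin."""
    while not all(c.done for c in contexts):
        for ctx in contexts:
            if not ctx.done:
                step(ctx, instruction_memory)
    return contexts
```

Because each program carries its own registers, program counter, and program counter stack, two contexts can run the same shared code with different data without disturbing each other.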

8. (Previously Presented) The processor of claim 1 or claim 23 wherein said 
processor executes SIMD vector instructions of vector length N and executes in parallel a 
plurality of instructions having SIMD vector lengths that sum up to N.
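
As a minimal sketch of the lane sharing recited in claim 8, narrower instructions can be placed side by side in an N-lane datapath when their vector lengths sum to N. The function name `assign_lanes` and the contiguous, in-order lane allocation are assumptions made for illustration.

```python
def assign_lanes(vector_lengths, n):
    """Map each instruction to a half-open lane range (start, stop).

    Assumption: lanes are allocated contiguously in issue order, and a
    group is accepted only if its vector lengths sum exactly to n.
    """
    if any(length <= 0 for length in vector_lengths):
        raise ValueError("vector lengths must be positive")
    if sum(vector_lengths) != n:
        raise ValueError("vector lengths must sum to n")
    lanes = []
    start = 0
    for length in vector_lengths:
        lanes.append((start, start + length))
        start += length
    return lanes
```

For N = 4, one length-2 instruction and two length-1 instructions would occupy lanes 0 to 1, lane 2, and lane 3 in the same cycle.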

9. (Previously Presented) The processor of claim 1 or claim 23 wherein said
instructions comprise load instructions for loading data from a data memory. 

10. (Previously Presented) The processor of claim 1 or claim 23 wherein said 
instructions comprise store instructions for storing data in a memory. 

11. (Previously Presented) The processor of claim 9 wherein said data memory
comprises a cache. 

12. (Previously Presented) The processor of claim 9 wherein address space of said data 
memory comprises a frame buffer unit. 

13. (Previously Presented) The processor of claim 9 wherein address space of said data
memory comprises a texture memory unit. 



14. (Previously Presented) A method of executing instructions from a plurality of 
programs comprising: 

identifying N programs of said plurality of programs; 

interleaving instructions from said N programs in a processor pipeline wherein said 
pipeline has an average latency of one instruction per cycle; and 

executing said instructions such that a first instruction from one of said N programs is 
completed before beginning execution of a second instruction of said one of said N 
programs wherein no no-op is inserted into the pipeline for the purpose of ensuring that 
said first instruction is completed before beginning execution of said second instruction. 
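
The no-op condition of claim 14 can be illustrated with simple arithmetic. The sketch assumes round-robin issue, a pipeline of depth D, and that each instruction of a program must wait for that program's previous instruction to complete; the function names are hypothetical.

```python
def noops_per_round(num_programs, depth):
    """No-ops needed per round of one instruction from each program."""
    return max(0, depth - num_programs)


def utilization(num_programs, depth):
    """Fraction of issue cycles that carry a useful instruction."""
    return num_programs / max(num_programs, depth)
```

Under these assumptions a single program on a depth-4 pipeline needs three no-ops after every instruction, whereas interleaving four or more programs needs none.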

15. (Previously Presented) The method of claim 14 or claim 24 further including the
step of assigning a program counter to each of said N programs. 

16. (Previously Presented) The method of claim 14 or claim 24 further including the
step of assigning a register to each of said N programs. 

17. (Previously Presented) The method of claim 14 or claim 24 wherein said graphics 
processing execution pipeline has a depth of N. 

18. (Previously Presented) The method of claim 14 or claim 24 wherein said processor
executes SIMD vector instructions of vector length N and executes in parallel a plurality 
of instructions having SIMD vector lengths that sum up to N. 

23. (Previously Presented) A programmable processor for executing a plurality of 
programs, said programmable processor comprising: 

an execution pipeline having an average pipeline latency of one instruction per cycle; and 

an interleaver for interleaving instructions from said plurality of programs and providing 
said instructions to said pipeline for execution wherein a next instruction from one of said 
plurality of programs is not provided to said pipeline until a previous instruction of said
one of said plurality of programs has completed and wherein no no-op is inserted into the 
pipeline for the purpose of ensuring that said next instruction is not provided to said 
pipeline until said previous instruction has completed. 

24. (Previously Presented) A method of gathering instructions from a plurality of 
programs comprising: 

identifying N programs of said plurality of programs wherein N is greater than or equal to 
the depth of a processor pipeline; 

interleaving instructions from said N programs in said processor pipeline; and 
executing said instructions. 

25. (Previously Presented) A programmable processor for executing a plurality of
programs, said programmable processor comprising: 

a target program counter coupled to a plurality of program counters;

each of said plurality of program counters coupled to an instruction memory;

instructions from said instruction memory coupled to an instruction decode; 

said decode coupled to a plurality of registers; 

each of said plurality of registers coupled to an operand route; 

said operand route coupled to an arithmetic datapath; 

said datapath and an output of a data memory coupled to a result route; and 

an output of said result route fed back to each of said plurality of registers. 

26. (Previously Presented) The programmable processor of claim 25 wherein said 
plurality of program counters is equal to said plurality of programs to be interleaved. 

27. (Previously Presented) The programmable processor of claim 25 wherein said 
plurality of registers is equal to said plurality of programs to be interleaved. 

28. (Previously Presented) The programmable processor of claim 25 wherein said 
plurality of registers is more than said plurality of programs to be interleaved. 

29. (Previously Presented) The programmable processor of claim 25 wherein said
instruction memory is larger than needed to hold said plurality of programs.



30. (Previously Presented) The programmable processor of claim 25 wherein said data 
memory is larger than needed to hold a data set accessed by each of said plurality of 
programs. 

31. (Previously Presented) The programmable processor of claim 25 wherein each of
said plurality of registers is double buffered and contains twice as many copies of 
registers as said plurality of programs. 

32. (Previously Presented) A method of executing one or more complex or compound 
instructions from a plurality of programs, comprising: 

implementing said instructions in one or more pipelined units wherein each of said 
instructions is issued to said one or more units in each cycle.

33. (Previously Presented) A method of executing one or more instructions from a 
plurality of programs, comprising: 

assigning a first output register slot to a first of said plurality of programs; 

executing said one or more instructions of said first program until said first program is 
completed; 

loading output of said first program into its reserved space when said first program is
completed; 

checking to see if all of said plurality of programs are completed; 

checking to see if a second output register slot is available to assign to a second program 
from said plurality of programs when said first program is completed;

checking to see if one or more instructions are available when at least one of said 
plurality of programs is not completed; and 

placing a no-op when no more instructions are available or said second output register
slot is not available. 